KeyGraph: Automatic Indexing by Co-Occurrence Graph Based on Building Construction Metaphor

Authors

  • Yukio Ohsawa
  • Nels E. Benson
  • Masahiko Yachida
Abstract

In this paper, we present an algorithm for extracting keywords that represent the asserted main point of a document, without relying on external devices such as natural language processing tools or a document corpus. Our algorithm, KeyGraph, is based on segmenting a graph that represents the co-occurrence between terms in a document into clusters. Each cluster corresponds to a concept on which the author's idea is based, and the terms ranked highest by a statistic based on each term's relationship to these clusters are selected as keywords. This strategy comes from considering that a document is constructed like a building, expressing new ideas on the basis of traditional concepts. The experimental results show that the terms extracted in this way match the author's point quite accurately, even though KeyGraph does not use each term's average frequency in a corpus; that is, KeyGraph is a content-sensitive, domain-independent indexing device.
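
The pipeline outlined in the abstract (co-occurrence graph, clusters, cluster-based ranking) can be illustrated with a small Python sketch. This is not the authors' implementation: the function name `keygraph_sketch`, the parameters `n_base` and `min_cooc`, the whitespace tokenization, the use of connected components as clusters, and the scoring formula are all simplifying assumptions chosen only to make the idea concrete.

```python
from collections import Counter
from itertools import combinations


def keygraph_sketch(sentences, n_base=4, min_cooc=2):
    """Toy illustration; returns (clusters, scores) for a list of sentences."""
    # Treat each sentence as a set of lowercase, whitespace-separated terms.
    docs = [set(s.lower().split()) for s in sentences]

    # Assumption: the n_base most frequent terms become the graph's nodes.
    freq = Counter(t for d in docs for t in d)
    base = sorted(freq, key=lambda t: (-freq[t], t))[:n_base]

    # Count sentence-level co-occurrence for every pair of base terms.
    cooc = Counter()
    for d in docs:
        for a, b in combinations(sorted(d.intersection(base)), 2):
            cooc[(a, b)] += 1

    # Assumption: link pairs that co-occur in at least min_cooc sentences.
    neighbors = {t: set() for t in base}
    for (a, b), n in cooc.items():
        if n >= min_cooc:
            neighbors[a].add(b)
            neighbors[b].add(a)

    # Assumption: clusters are the connected components of that graph.
    clusters, seen = [], set()
    for start in base:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            t = stack.pop()
            if t not in component:
                component.add(t)
                stack.extend(neighbors[t] - component)
        seen |= component
        clusters.append(component)

    # Hypothetical statistic: a term scores high when it shares sentences
    # with many clusters (1 minus the product of per-cluster "miss" rates).
    scores = {}
    for w in freq:
        miss = 1.0
        for g in clusters:
            touching = [d for d in docs if d & g]
            shared = [d for d in touching if w in d and d & (g - {w})]
            miss *= 1.0 - len(shared) / len(touching)
        scores[w] = 1.0 - miss
    return clusters, scores


if __name__ == "__main__":
    demo = [
        "graph cluster term",
        "graph cluster node",
        "graph cluster",
        "corpus frequency term",
        "corpus frequency index",
        "corpus frequency",
    ]
    found_clusters, found_scores = keygraph_sketch(demo)
    print(found_clusters)
    print(sorted(found_scores.items(), key=lambda kv: -kv[1]))
```

In the demo input, "term" appears alongside both clusters while "node" and "index" each touch only one, so "term" receives the higher score of the three. That mirrors the building metaphor in a loose way: clusters act as the established concepts, and terms that tie several clusters together are candidates for the document's main point.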

Similar resources

Mental Timeline in Persian Speakers’ Co-speech Gestures based on Lakoff and Johnson’s Conceptual Metaphor Theory

One of the introduced conceptual metaphors is the metaphor of "time as space". Time as an abstract concept is conceptualized by a concrete concept like space. This conceptualization of time is also reflected in co-speech gestures. In this research, we try to find out what dimension and direction the mental timeline has in co-speech gestures and under the influence of which one of the metaphoric...

Automatic graph construction of periodic open tubulene ((5,6,7)3) and computation of its Wiener, PI, and Szeged indices

The mathematical properties of nano molecules are an interesting branch of nanoscience for researchers nowadays. The periodic open single wall tubulene is one of the nano molecules which is built up from two caps and a distancing nanotube/neck. We discuss how to automatically construct the graph of this molecule and plot the graph with the spring layout algorithm in the Graphviz and NetworkX packages. The...

CO-graph: A new graph-based technique for cross-lingual word sense disambiguation

In this paper, we present a new method based on co-occurrence graphs for performing Cross-Lingual Word Sense Disambiguation (CLWSD). The proposed approach comprises the automatic generation of bilingual dictionaries, and a new technique for the construction of a co-occurrence graph used to select the most suitable translations from the dictionary. Different algorithms that combine both the dict...

Graph-based Word Clustering using a Web Search Engine

Word clustering is important for automatic thesaurus construction, text classification, and word sense disambiguation. Recently, several studies have reported using the web as a corpus. This paper proposes an unsupervised algorithm for word clustering based on a word similarity measure by web counts. Each pair of words is queried to a search engine, which produces a co-occurrence matrix. By cal...

Alleviating Search Uncertainty Through Concept Associations: Automatic Indexing, Co-Occurrence Analysis, and Parallel Computing

In this article, we report research on an algorithmic approach to alleviating search uncertainty in a large information space. Grounded on object filtering, automatic indexing, and co-occurrence analysis, we performed a...

Publication date: 1998